Clustering Mixed Data via Diffusion Maps
Abstract
Data clustering is a common technique for statistical data analysis and is used in many fields, including machine learning, data mining, customer segmentation, trend analysis, pattern recognition and image analysis. Although many clustering algorithms have been proposed, most of them deal with numerical data. Finding the similarity between numeric objects usually relies on a common distance measure such as the Euclidean distance. Clustering categorical (nominal) data is more difficult, because the common distance measures used for numeric data are not applicable to nominal attributes. Moreover, data in real applications often mix attribute types, with numeric and nominal attributes residing together. In this paper, we propose a technique that solves this problem. We suggest transforming the input data (categorical and numerical) into categorical values. This is achieved by automatic non-linear transformations, which identify geometric patterns in these datasets, that find the connections among...
Similar resources
Diffusion Maps Clustering for Magnetic Resonance Q-Ball Imaging Segmentation
White matter fiber clustering aims to get insight about anatomical structures in order to generate atlases, perform clear visualizations, and compute statistics across subjects, all important and current neuroimaging problems. In this work, we present a diffusion maps clustering method applied to diffusion MRI in order to segment complex white matter fiber bundles. It is well known that diffusi...
Fuzzy Adaptive Resonance Theory, Diffusion Maps and their applications to Clustering and Biclustering
In this paper, we describe an algorithm FARDiff (Fuzzy Adaptive Resonance Diffusion) which combines Diffusion Maps and Fuzzy Adaptive Resonance Theory to do clustering and biclustering on high dimensional data. We describe some applications of this method.
Unsupervised Clustering Using Diffusion Maps for Local Shape Modelling
Understanding the biological variability of anatomical objects is essential for statistical shape analysis and to distinguish between healthy and pathological structures. Statistical Shape Modelling (SSM) can be used to analyse the shapes of sub-structures aiming to describe their variation across individual objects and between groups of them [1]. However, when the shapes exhibit self-similarit...
Multimodal diffusion geometry by joint diagonalization of Laplacians
We construct an extension of diffusion geometry to multiple modalities through joint approximate diagonalization of Laplacian matrices. This naturally extends classical data analysis tools based on spectral geometry, such as diffusion maps and spectral clustering. We provide several synthetic and real examples of manifold learning, retrieval, and clustering demonstrating that the joint diffusio...
Manifold Learning and Dimensionality Reduction with Diffusion Maps
This report gives an introduction to diffusion maps, some of their underlying theory, as well as their applications in spectral clustering. First, the shortcomings of linear methods such as PCA are shown to motivate the use of graph-based methods. We then explain Locally Linear Embedding [9], Isomap [11] and Laplacian eigenmaps [1], before we give details on diffusion maps and anisotropic diffu...
Publication date: 2009